ABSTRACT

This project designs a word cloud analysis program based on a web crawler. Taking Jingdong Mall as the platform, it crawls all review information for designated products, cleans and analyzes the review data obtained by the crawler, and generates a word cloud. The visual analysis of the review data clearly shows the advantages and disadvantages of products as reflected in customer evaluations, and provides an important reference for consumers choosing products and for businesses seeking to improve decision-making and optimize services. The project is developed in Python 3 with PyCharm as the IDE. It uses the requests, json, wordcloud, and PyMongo libraries, Navicat to connect to MongoDB, and the PyQt5 library to implement the visual interface, together with JavaScript, HTML5, CSS3, MySQL, word clouds, and Boosting and Bagging algorithms for data analysis and algorithm optimization. In addition to pointing consumers to cost-effective, well-reviewed, and highly rated goods, it gives sellers more specific data for addressing their own shortcomings.

Keywords: Jingdong crawler; Natural Language Processing; Data mining; Visualization